Packages and Previous Assigments

A1 and A2

Material from A1

Basic Information

Getting the Required packages

Getting the Data and doing some basic analysis

Extracting the samples from the data

Here it shows that we do not have repeated genes

#Begin Analysis

Create a Box Plot for Non-normalized data

Desnsity distribtion for Non-Normalized Data

Apply TMM to our data

Graphing the normalized data

The biological coeffient of variation (BVC) of our data to see how much it varies

Finally we have the Normalized DataSet

Time to Map the essemble IDs to HUGO symbols

Questions and answers

Q: What are the control and test conditions of the dataset? The control conditon are the cells that were treated with DMSO. There were a couple test conditions but in here I chose to focus on isoginkgetin treated cells.

Q:Why is the dataset of interest to you? I was intrested in the magnament of telomeres as they are important in understanding aging, cell health, cancer, the biological immortality seen in the species of jellyfish Turritopsis dohrnii also known as the immortal jellyfish and to better understand animal telomarse regulation

Q:Were there expression values that were not unique for specific genes? How did you handle these? No

Q:Were there expression values that could not be mapped to current HUGO symbols? Yes, there were 181, I removed them here but added them later in A2 and A3

Q:How many outliers were removed? There were 60106 we stated with and when we filtred out low counts we ened up with 15082 thus we had 45024 outliers removed Q:How did you handle replicates? There were no replicates

The coverage is pretty good I decied to only look at the test case for the cells trated with isoginkgetin. There are 3 replicates for both this test case and for the control case.

References

(Tseng et al. 2015) (Tseng et al. 2015) (“Symbol Report for,” n.d.) (Database, n.d.) (MD, DJ, and GK 2010) (Ritchie et al. 2018) (Zhu et al. 2008) (Morgan 2019) (Xie 2020) (Zhu et al. 2008) (Efron and Tibshirani 2019) (Lang and CRAN team 2019) (Rowe 2016) (Chen 2018) (Gu, Eils, and Schlesner 2016) (Gu et al. 2014) (Gable et al. 2019) (Bioinformatics 2019) Data was from GEO with ID ‘GSE72055’

Differential Gene Expression

The MDS Plot from A1

Create our data matrix

Getting the P values

We will be applying empircal Bayes to compute differential expression for the data model we created above to get the P values

How many gene pass the threshold p-value < 0.05 in both the unadjusted and adjusted p-values?

Finding the Pvalue with a different model using the Limma package

Compare the results from the two different models

Set up our edgeR objects

Compare the results from the two different models

MA PLot

Differential Gene Expression

Conduct differential expression analysis with your normalized expression set from Assignment #1. Define your model design to be used to calculate differential expression - revisit your MDS plot from Assignment #1 to demonstrate your choice of factors in your model.

Calculate p-values for each of the genes in your expression set. How many genes were significantly differentially expressed? What thresholds did you use and why? Based of the P values 12739 genes where signifcantly diffentailly expressed

Multiple hypothesis testing - correct your p-values using a multiple hypothesis correction method. Which method did you use? And Why? How many genes passed correction?

I used the Emprical Bayes method, Based of the adjusted P values 12643 genes where signifcantly diffentailly expressed

Based off the adjusted P values 12643 genes where signifcantly diffentailly expressed I used

Thresholded over-representation analysis

Upreglauted and downregulated genes

Threshold number of genes

Running thresholded gene set enrichment analysis with gProfiler

I ran an analysis of the upregualed genes the downregulated genes and finally all of them together using gprofiler https://biit.cs.ut.ee/gprofiler/gost

Here are the downregulated gene results

Here are the upregulataed gene results

The upregulated genes seem to to have fewer terms show up with as many hits and when run together the results seem to favor the genes that have been downregulated

Q:Which method did you choose and why? I used g:profiler since it worked really well with the homework assigment and has a large number of datasets it can pull from Q:What annotation data did you use and why? What version of the annotation are you using? I used the following annoted data sets: GO molecular function GO biological process Reactome WikiPathways

Q:How many genesets were returned with what thresholds?

Do the over-representation results support conclusions or mechanism discussed in the original paper? These results seem to support the findings in the paper many of the terms that were returned for the downregulated gene sets seeme to invlvoved in RNA regualtion such as RNA Polymerase II Transcription, ATP binding,ribonucleotide binding suggesting that isoginkgetin does indeed mimic the effects of RNA exosome inhibition

References

(Tseng et al. 2015) (Tseng et al. 2015) (“Symbol Report for,” n.d.) (Database, n.d.) (MD, DJ, and GK 2010) (Ritchie et al. 2018) (Zhu et al. 2008) (Morgan 2019) (Xie 2020) (Zhu et al. 2008) (Efron and Tibshirani 2019) (Lang and CRAN team 2019) (Rowe 2016) (Chen 2018) (Gu, Eils, and Schlesner 2016) (Gu et al. 2014) (Gable et al. 2019) (Bioinformatics 2019)

library(RCurl) 
## Loading required package: bitops
#install required R and bioconductor packages
tryCatch(expr = { library("RCurl")}, 
         error = function(e) {  install.packages("RCurl")}, 
         finally = library("RCurl"))

#use library
tryCatch(expr = { library("limma")}, 
         error = function(e) { source("https://bioconductor.org/biocLite.R")
           biocLite("limma")}, 
         finally = library("limma"))
tryCatch(expr = { library("Biobase")}, 
         error = function(e) { source("https://bioconductor.org/biocLite.R")
           biocLite("Biobase")}, 
         finally = library("Biobase"))
tryCatch(expr = { library("ggplot2")}, 
         error = function(e) { install.packages("ggplot2")}, 
         finally = library("ggplot2"))

#For creating json and communicating with cytoscape
tryCatch(expr = { library("httr")}, 
         error = function(e) { install.packages("httr")}, 
         finally = library("httr"))
## 
## Attaching package: 'httr'
## The following object is masked from 'package:Biobase':
## 
##     content
tryCatch(expr = { library("RJSONIO")}, 
         error = function(e) { install.packages("RJSONIO")}, 
         finally = library("RJSONIO"))
if (!requireNamespace("GSAr", quietly = TRUE))
    install.packages("GSA")
## Installing package into '/usr/local/lib/R/site-library'
## (as 'lib' is unspecified)
if (!requireNamespace("VennDiagram", quietly = TRUE))
    install.packages("VennDiagram")

if (!requireNamespace("futile.logger", quietly = TRUE))
    install.packages("futile.logger")

library(futile.logger)
library(VennDiagram)
library(GSA)

The Start of A3

Introduction

In the paper “Human telomerase RNA processing and quality control” I took the data set that compared The cells that were treated with DMSO(the control) isoginkgetin treated cells and prepared it for analysis. One of the results of the paper was that isoginkgetin behaves as an RNA exosome inhibitor leaving long telomerase RNA transcripts (htr). In A2 I did both threshold analysis and over representation analysis to find the genes that were unregulated and down-regulated and what terms seem to show up and are related to these genes.

Non-thresholded Gene set Enrichment Analysis

Downloading the geneset

gmt_url = "http://download.baderlab.org/EM_Genesets/current_release/Human/symbol/"
# list all the files on the server
filenames = getURL(gmt_url)
tc = textConnection(filenames)
contents = readLines(tc)
close(tc)
# get the gmt that has all the pathways and does not include terms inferred from
# electronic annotations(IEA) start with gmt file that has pathways only
rx = gregexpr("(?<=<a href=\")(.*.GOBP_AllPathways_no_GO_iea.*.)(.gmt)(?=\">)", contents, 
    perl = TRUE)
gmt_file = unlist(regmatches(contents, rx))
dest_gmt_file <- file.path(gmt_file)
download.file(paste(gmt_url, gmt_file, sep = ""), destfile = dest_gmt_file)
#Creating a rank file
qlf_output_hits_withgn_rank <- qlf_output_hits_withgn[order(qlf_output_hits_withgn$rank, decreasing = TRUE),]

#unrank get to the bottom of this
#gone_file <- qlf_output_hits_withgn_rank[c("ID","rank")]
#qlf_output_hits_withgn_rank$hgnc_symbol[7]<-"nothgnac_ ENSG00000276216"
#somehow didn't have porblem in A1

rank_file <- qlf_output_hits_withgn_rank[c("ID","rank")]

write.table(x=rank_file,
            file=file.path("data","tel_exp.rnk"),sep = "\t",
            row.names = FALSE,col.names = FALSE,quote = FALSE)

I had to load the rnk file we created above into GSEA_4.0.3

Afterward we click on RUN GSEA on Pre-Ranked gene list

I put the gmt folder we downloaded into the Gene set database

I put our ranked list in the Ranked list

I had collapase in the next field and then the chip platform was “ftp.broadinstitute.org://pub/gsea/annotations_versioned/Human_ENSEMBL_Gene_ID_MSigDB.v7.0.chip” to convert my ranked list essamble IDs for GSEA to use(I kept getting errors when I tried anything else) Max size was set to 200 and everything else was left as default

I used A2/Human_GOBP_AllPathways_no_GO_iea_April_01_2020_symbol.gmt for the data set

Summary of enrichment results

Enrichment in phenotype: na 1767 / 4857 gene sets are upregulated in phenotype na_pos 569 gene sets are significant at FDR < 25% 397 gene sets are significantly enriched at nominal pvalue < 1% 525 gene sets are significantly enriched at nominal pvalue < 5%

Enrichment in phenotype: na 3090 / 4857 gene sets are upregulated in phenotype na_neg 1186 gene sets are significantly enriched at FDR < 25% 536 gene sets are significantly enriched at nominal pvalue < 1% 926 gene sets are significantly enriched at nominal pvalue < 5%

There are more down-regulated process than up regulated process, but outside of that it is not a strait-forward comparison. The GSEA gave process and returned if they had a fold change or not, In A2 we found pout which genes were up-regulated and which ones were not, although it would make sense that more genes being up or down regulated would imply more process with a positive or negative fold change, processes involve more genes working together and interactions make it so its not a strait forward comparison.In addition when I used g:Profiler in A2 I got terms and these terms or simply what the genes are involved with and not how they affect these process.

Visualize your Gene set Enrichment Analysis in Cytoscape

FDR q-value cutoff was set to 0.01.

Here is the inital network without any manual layout The Network was annottated using the default paramnters as follows The entire network was annotated and the Label colunm was set to GS_DESCR(The Gene Set Description)

Here is the Publication Ready Figure Here is the theme network

Many of the themes seen here realte to either to the cell cyle suck a chromosome segragation, hallmakr miotic spindle and the postive cell cyle themes and the others relate to mRNA translation such as protien localization endoplasmic. This fits the model quite well as the telomerase RNA (hTR) is involved in the cell cyle and the inhibition of the RNA exosome will effect protein expression, RNA translation and in genral RNA proceesing and regualtion.

Interptation

1.Do the enrichment results support conclusions or mechanism discussed in the original paper? How do these results differ from the results you got from Assignment #2 thresholded methods

The enrichment results appear to support the results from the paper. The GSEA analysis following pathway with positive fold changes Eukaryotic Translation Termination which is very much in line with the results of the paper, a lack of translation would make sense and would support isoginkgetin acting as a RNA exosome inhibition. Another pathway that has a positive fold change is Response Of Eif2ak4 (Gcn2) To Amino Acid Deficiency. Once again if RNA Exosome inhibition is taking place then low levels of RNA degradation would imply low levels of Amino Acids thus this positive fold change of this process would makes sense. The genes of interest that were known parts of the RNA Exomsome are hRRP40(which was down-regulated as expected), DIS3(which did not show significant up-regualtion or down-regulation) and ZCCHC8(which was up-regulated).ANAPC5, one of my up-regulated genes found in A2 promotes anaphase [https://www.genenames.org/data/gene-symbol-report/#!/hgnc_id/HGNC:15713] which somewhat contrast with GSEA results where Mitotic Nuclear Division and Positive Regulation Of Cell Cycle Phase Transition were processes which had a negative fold change. From the first subnetwork I created using the Default filter for 2.5 3.792 and -2.601 and -2.5 my largest cluster was “gli3 processed degradation”. Gli3 is a zinc finger protein just like ZCCHC8 one of the genes of interest in the paper[https://www.genecards.org/cgi-bin/carddisp.pl?gene=GLI3] [https://www.genenames.org/data/gene-symbol-report/#!/hgnc_id/HGNC:25265]. Gli3 helps regulated gene expression. During RNA exosome inhibition it is possible that gene expression would go down due to the lack of Amino Acids from the RNA transcripts not being broken down. This idea is also supported by the fact Response Of Eif2ak4 (Gcn2) To Amino Acid Deficiency has a positive fold change according to the GSEA results. Thus the process involved in the degradation of a gene expression regulator would make sense and could explain why ZCCHC8 was found to be up-regulated in A2 and would in turn help support the findings of the original paper. Originally the Up-regulation of ZCCHC8 a gene involved in the exosome appeared to contradict the results but the pathway enrihcment analysis made it clear why. Cytoscape helps expand upon the findings in A2 because it goes over the pathways and process involved with the genes and can further elucidate some of the reactions seen. hRRP40(hgnc EXOSC3) is a down-regualted gene in the network it was clustered with protein translation targeting in the subnetwork. DIS3 another gene that the paper used to show exosome inhibition was also found to be in the protein cluster in my subnetwork. This differs a bit from the A2 because DIS3 (ENSG00000083520) was not found to be down-regulated or up-regulated and had a rank that was around the middle(7251).

From the Subnetwork I created using the Default filter for 3.3 3.792 and -2.601 and -2.3 the cluster with the largest number of nodes was Protein localization endoplasmic with 22 different nodes, which included pathways such as EUKARYOTIC TRANSLATION TERMINATION, EUKARYOTIC TRANSLATION ELONGATION and EUKARYOTIC TRANSLATION INITIATION. Pathways I would expect to be involved with longer htR transcripts being built up by lack of degradation via RNA exosome inhibition.

2.Can you find evidence, i.e. publications, to support some of the results that you see. How does this evidence support your result?

“ZCCHC8, the nuclear exosome targeting component, is mutated in familial pulmonary fibrosis and is required for telomerase RNA maturation” it was demonstrated that for htr transcripts to become mature via ZCCHC8, because the exosome is depleted it its possible that ZCCHC8 is up regulated due to more long form htr transcripts accumulating.(Gable et al. 2019)

Sometimes the most interesting information is the gene that has no information. In this type of pathway analysis we can only discover what we have already described previously in the literature or pathway databases. Often pathways found in one disease are applicable to other diseases so this technique can be very helpful. It is important to highlight any genes that are significantly differentially expressed in your model but are not annotated to any pathways. We refer to this set of genes as the dark matter.

Dark Matter

gmt_file <- file.path(getwd(),"Human_GOBP_AllPathways_no_GO_iea_April_01_2020_symbol.gmt")
capture.output(genesets<-GSA.read.gmt(gmt_file),file="gsa_load.out")
names(genesets$genesets) <- genesets$geneset.names
#Quickely fix our expression file
library(biomaRt)
listMarts()
##                biomart               version
## 1 ENSEMBL_MART_ENSEMBL      Ensembl Genes 99
## 2   ENSEMBL_MART_MOUSE      Mouse strains 99
## 3     ENSEMBL_MART_SNP  Ensembl Variation 99
## 4 ENSEMBL_MART_FUNCGEN Ensembl Regulation 99
listEnsemblArchives()[1:10,]
##              name     date                                url version
## 1  Ensembl GRCh37 Feb 2014          http://grch37.ensembl.org  GRCh37
## 2      Ensembl 99 Jan 2020 http://jan2020.archive.ensembl.org      99
## 3      Ensembl 98 Sep 2019 http://sep2019.archive.ensembl.org      98
## 4      Ensembl 97 Jul 2019 http://jul2019.archive.ensembl.org      97
## 5      Ensembl 96 Apr 2019 http://apr2019.archive.ensembl.org      96
## 6      Ensembl 95 Jan 2019 http://jan2019.archive.ensembl.org      95
## 7      Ensembl 94 Oct 2018 http://oct2018.archive.ensembl.org      94
## 8      Ensembl 93 Jul 2018 http://jul2018.archive.ensembl.org      93
## 9      Ensembl 92 Apr 2018 http://apr2018.archive.ensembl.org      92
## 10     Ensembl 91 Dec 2017 http://dec2017.archive.ensembl.org      91
##    current_release
## 1                 
## 2                *
## 3                 
## 4                 
## 5                 
## 6                 
## 7                 
## 8                 
## 9                 
## 10
ensembl <- useMart("ensembl")
datasets <- listDatasets(ensembl)
kable(head(datasets),format = "html")
dataset description version
acalliptera_gene_ensembl Eastern happy genes (fAstCal1.2) fAstCal1.2
acarolinensis_gene_ensembl Anole lizard genes (AnoCar2.0) AnoCar2.0
acchrysaetos_gene_ensembl Golden eagle genes (bAquChr1.2) bAquChr1.2
acitrinellus_gene_ensembl Midas cichlid genes (Midas_v5) Midas_v5
amelanoleuca_gene_ensembl Panda genes (ailMel1) ailMel1
amexicanus_gene_ensembl Mexican tetra genes (Astyanax_mexicanus-2.0) Astyanax_mexicanus-2.0
kable(head(datasets[grep(datasets$dataset,pattern = "sapiens"),]),format = "html")
dataset description version
78 hsapiens_gene_ensembl Human genes (GRCh38.p13) GRCh38.p13
ensembl = useDataset("hsapiens_gene_ensembl",mart=ensembl)

biomart_human_filters <- listFilters(ensembl)



conversion_stash <- "tel_exp_conversion.rds"
if(file.exists(conversion_stash)){
  exp_conversion <- readRDS(conversion_stash)
} else {
  exp_conversion <- getBM(attributes = c("ensembl_gene_id","hgnc_symbol"),
                            filters = c("ensembl_gene_id"),
                            values = exp_conversion$ID,
                            mart = ensembl)
  saveRDS(exp_conversion, conversion_stash)
}

nrow(expression) - nrow(exp_conversion)
## integer(0)
dim(exp_conversion)
## [1] 14901     2
#Getting the expression data and the rnk file we created
#expression <- read.table(file.path(getwd(),"data","express.txt"), header = TRUE, sep = "\t", quote="\"", stringsAsFactors = FALSE)
expression <- exp_conversion
ranks <- read.table(file.path(getwd(),"data","tel_exp.rnk"), header = TRUE, sep = "\t", quote="\"", stringsAsFactors = FALSE)
#get all the GSEA directories
gsea_directories <- list.files(path = file.path(getwd(),"data"),
 pattern = "\\.GseaPreranked")
if(length(gsea_directories) == 1){
 gsea_dir <- file.path(getwd(),"data",gsea_directories[1])
 #get the gsea result files
 gsea_results_files <- list.files(path = gsea_dir, pattern = "gsea_report_*.*.xls")
 #there should be 2 gsea results files
 enr_file1 <-read.table(file.path(gsea_dir,gsea_results_files[1]), header = TRUE, sep = "\t", quote="\"", stringsAsFactors = FALSE,row.names=1)
 enr_file2 <-read.table(file.path(gsea_dir,gsea_results_files[1]), header = TRUE, sep = "\t", quote="\"", stringsAsFactors = FALSE,row.names=1)
}

#get the genes from the set of enriched pathway 
all_enr_genesets<- c(rownames(enr_file1), rownames(enr_file2))
genes_enr_gs <- c()
for(i in 1:length(all_enr_genesets)){
 current_geneset <-
unlist(genesets$genesets[which(genesets$geneset.names %in%
all_enr_genesets[i])])
 genes_enr_gs <- union(genes_enr_gs, current_geneset)
}
FDR_threshold <- 0.01
#get the genes from the set of enriched pathwasy 
all_sig_enr_genesets<- c(rownames(enr_file1)
[which(enr_file1[,"FDR.q.val"]<=FDR_threshold)],
rownames(enr_file2)[which(enr_file2[,"FDR.q.val"]<=FDR_threshold)])
genes_sig_enr_gs <- c()
for(i in 1:length(all_sig_enr_genesets)){
 current_geneset <-unlist(genesets$genesets[which(genesets$geneset.names %in% all_sig_enr_genesets[i])])
 genes_sig_enr_gs <- union(genes_sig_enr_gs, current_geneset)
}

genes_all_gs <- unique(unlist(genesets$genesets))
A <- genes_all_gs
B <- genes_enr_gs
C <- expression[,2]
png(file.path(getwd(),"data","dark_matter_overlaps.png"))
draw.triple.venn( area1=length(A), area2=length(B), area3 =
length(C), n12 = length(intersect(A,B)), n13=length(intersect(A,C)), n23 = length(intersect(B,C)), n123 = length(intersect(A,intersect(B,C))), category = c("all genesets","all enrichment results","expression"), fill = c("red","green","blue"), cat.col = c("red","green","blue"))
## (polygon[GRID.polygon.11], polygon[GRID.polygon.12], polygon[GRID.polygon.13], polygon[GRID.polygon.14], polygon[GRID.polygon.15], polygon[GRID.polygon.16], text[GRID.text.17], text[GRID.text.18], text[GRID.text.19], text[GRID.text.20], text[GRID.text.21], text[GRID.text.22], text[GRID.text.23], text[GRID.text.24])

#Convert back to essambleIds and see which ranked genes have no annotation
genes_no_annotation <- setdiff(expression[,2], genes_all_gs)
exp_out <- expression[which(expression[,2] %in% genes_no_annotation),]
final <- exp_out[,1]
ranked_gene_no_annotation <- ranks[which(ranks[,1] %in% final),]
ranked_gene_no_annotation[1:10,]
##    ENSG00000171223 X22.0553924156636
## 16 ENSG00000236824          18.45039
## 18 ENSG00000270959          18.40687
## 19 ENSG00000142871          18.40654
## 23 ENSG00000272734          18.07540
## 26 ENSG00000189060          17.97792
## 30 ENSG00000188483          17.74155
## 36 ENSG00000196428          17.31447
## 41 ENSG00000153714          17.19826
## 43 ENSG00000245532          17.16323
## 51 ENSG00000260339          16.97883
dim(ranked_gene_no_annotation)
## [1] 3253    2

There were only 3253 genes that didn’t have an annotation

The first result(BCYRN1) I checked Uniprot here https://www.uniprot.org/uniprot/I7G5M5 and found it only in mice at the Experimental evidence at transcript level with a low annotation score.

ranked_gene_no_annotation <- ranks[which(ranks[,1] %in% final),]
not_annotated <- ranked_gene_no_annotation[,1]
normalized_count_data <-  NormalizedData[which(NormalizedData[,1] %in% not_annotated),]
kable(normalized_count_data[1:5,1:5], type="html")
ensembl_gene_id hgnc_symbol dmso12_1 dmso12_2 dmso12_3
4 ENSG00000000460 C1orf112 71.792801 72.7708696 74.955179
18 ENSG00000002079 MYH16 1.184483 0.8344486 1.373174
30 ENSG00000003249 DBNDD1 9.048130 10.3610698 8.975866
49 ENSG00000004766 VPS50 59.059614 56.7772719 61.055984
63 ENSG00000005059 MCUB 19.280743 23.7470157 21.803316
heatmap_matrix <- normalized_count_data[,3:ncol(normalized_count_data)]
rownames(heatmap_matrix) <- normalized_count_data$ensembl_gene_id
colnames(heatmap_matrix) <- colnames(normalized_count_data[,3:ncol(normalized_count_data)])

Heatmap of unannotated genes

if(min(heatmap_matrix) == 0){
    heatmap_col = colorRamp2(c( 0, max(heatmap_matrix)), c( "white", "red"))
  } else {
    heatmap_col = colorRamp2(c(min(heatmap_matrix), 0, max(heatmap_matrix)), c("blue", "white", "red"))
  }
current_heatmap <- Heatmap(as.matrix(heatmap_matrix),
                               show_row_dend = TRUE,
                               show_column_dend = TRUE, 
                               col=heatmap_col,
                               show_column_names = TRUE, 
                               show_row_names = FALSE,
                               show_heatmap_legend = TRUE
                               )

heatmap_matrix <- t(scale(t(heatmap_matrix)))
if(min(heatmap_matrix) == 0){
    heatmap_col = colorRamp2(c( 0, max(heatmap_matrix)), c( "white", "red"))
  } else {
    heatmap_col = colorRamp2(c(min(heatmap_matrix), 0, max(heatmap_matrix)), c("blue", "white", "red"))
  }
current_heatmap <- Heatmap(as.matrix(heatmap_matrix),
                               show_row_dend = TRUE,
                               show_column_dend = TRUE, 
                               col=heatmap_col,
                               show_column_names = TRUE, 
                               show_row_names = FALSE,
                               show_heatmap_legend = TRUE
                               )
current_heatmap

References

(Tseng et al. 2015) (“Symbol Report for,” n.d.) (Database, n.d.) (MD, DJ, and GK 2010) (Ritchie et al. 2018) (Zhu et al. 2008) (Morgan 2019) (Xie 2020) (Zhu et al. 2008) (Efron and Tibshirani 2019) (Lang and CRAN team 2019) (Rowe 2016) (Chen 2018) (Gu, Eils, and Schlesner 2016) (Gu et al. 2014) (Gable et al. 2019) (Bioinformatics 2019)

Bioinformatics, UniProt ConsortiumEuropean Bioinformatics InstituteProtein Information ResourceSIB Swiss Institute of. 2019. UniProt ConsortiumEuropean Bioinformatics InstituteProtein Information ResourceSIB Swiss Institute of Bioinformatics. https://www.uniprot.org/uniprot/I7G5M5.

Chen, Hanbo. 2018. VennDiagram: Generate High-Resolution Venn and Euler Plots. https://CRAN.R-project.org/package=VennDiagram.

Database, Gene. n.d. “GLI3 Gene (Protein Coding).” GeneCards. https://www.genecards.org/cgi-bin/carddisp.pl?gene=GLI3.

Efron, Brad, and R. Tibshirani. 2019. GSA: Gene Set Analysis. https://CRAN.R-project.org/package=GSA.

Gable, Dustin L., Valeriya Gaysinskaya, Christine C. Atik, C. Conover Talbot, Byunghak Kang, Susan E. Stanley, Elizabeth W. Pugh, et al. 2019. “ZCCHC8 the Nuclear Exosome Targeting Component Is Mutated in Familial Pulmonary Fibrosis and Is Required for Telomerase Rna Maturation.” Genes & Development 33 (19-20): 1381–96. https://doi.org/10.1101/gad.326785.119.

Gu, Zuguang, Roland Eils, and Matthias Schlesner. 2016. “Complex Heatmaps Reveal Patterns and Correlations in Multidimensional Genomic Data.” Bioinformatics.

Gu, Zuguang, Lei Gu, Roland Eils, Matthias Schlesner, and Benedikt Brors. 2014. “Circlize Implements and Enhances Circular Visualization in R.” Bioinformatics 30 (19): 2811–2.

Lang, Duncan Temple, and the CRAN team. 2019. RCurl: General Network (Http/Ftp/...) Client Interface for R. https://CRAN.R-project.org/package=RCurl.

MD, Robinson, McCarthy DJ, and Smyth GK. 2010. “edgeRedgeR a Bioconductor Package for DifferentialExpression Analysis of Digital Gene Expression Data.” Bioinformatics 26 (7): 139–40.

Morgan, Martin. 2019. BiocManager: Access the Bioconductor Project Package Repository. https://CRAN.R-project.org/package=BiocManager.

Ritchie, Matthew E, Belinda Phipson, Di Wu, Yifang Hu, Charity W Law, Wei Shi, and Gordon K Smyth. 2018. “limma Powers Differential Expression Analyses for RNA-Sequencing and Microarray Studies.” Molecular Neurodegeneration 43 (7): e47. https://doi.org/10.1093/nar/gkv007.

Rowe, Brian Lee Yung. 2016. Futile.logger: A Logging Utility for R. https://CRAN.R-project.org/package=futile.logger.

Tseng, Chi-Kang, Hui-Fang Wang, Allison M. Burns, Morgan R. Schroeder, Martina Gaspari, and Peter Baumann. 2015. “Human Telomerase Rna Processing and Quality Control.” Cell Reports 13 (10): 2232–43. https://doi.org/10.1016/j.celrep.2015.10.075.

Xie, Yihui. 2020. Knitr: A General-Purpose Package for Dynamic Report Generation in R. https://yihui.org/knitr/.

Zhu, Yuelin, Sean Davis, Robert Stephens, Paul S. Meltzer, and Yidong Chen. 2008. “GEOmetadb: Powerful Alternative Search Engine for the Gene Expression Omnibus.” Bioinformatics (Oxford, England) 24 (23): 2798–2800. https://doi.org/10.1093/bioinformatics/btn520.